[rocprim] backport from release-staging-7.0#223
Merged
jayhawk-commits merged 8 commits on Jun 18, 2025
For #67. Seems like this was overlooked in ROCm/rocPRIM#734.
This change resolves a test failure observed in a future rocThrust test. It is a blocker for adding CCCL 2.7 parity in rocThrust and should be included in ROCm 7.0. It makes two changes:

- Skip including rocprim_version.hpp when `_CLANGD` is defined. This resolves complaints from `clangd`.
- Fix incorrect masking in the DPP warp scan.

---

Originally, execution of the scan operator was not masked, so every lane could potentially execute it. In the failing Thrust test, the operator involved some pointer chasing. My guess is that some optimization logic would still execute the scan operation (read: the pointer chasing) even though the input of the operation had been optimized out. Since the values of those registers are unknown at that point, the operator may chase a pointer into the middle of nowhere and crash.
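The masking idea can be sketched on the CPU. The snippet below is only a simplified model under stated assumptions, not rocPRIM's DPP implementation: each vector element stands in for one lane's register, and `simulated_warp_inclusive_scan` is a hypothetical name. The point it illustrates is that the operator runs only on lanes that have a valid source lane, so it never sees an undefined operand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU model of a warp-wide inclusive scan (Hillis-Steele).
// Each vector element plays the role of one lane's register.
template <class T, class BinaryOp>
std::vector<T> simulated_warp_inclusive_scan(std::vector<T> lanes, BinaryOp op)
{
    // Modelling assumption: a wavefront has at most 64 lanes.
    assert(lanes.size() <= 64);

    const std::size_t n = lanes.size();
    for (std::size_t offset = 1; offset < n; offset *= 2)
    {
        // Snapshot the registers so all lanes read the values from the previous step.
        const std::vector<T> previous = lanes;
        for (std::size_t lane = 0; lane < n; ++lane)
        {
            // Mask: a lane without a valid source lane skips the operator entirely.
            // In the unmasked variant, op would also run here on undefined input.
            if (lane >= offset)
            {
                lanes[lane] = op(previous[lane - offset], previous[lane]);
            }
        }
    }
    return lanes;
}
```

With an operator that dereferences pointers, the masked form matters: discarding the result of an invalid lane afterwards is not enough if evaluating the operator itself can fault.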
The behaviour of hipGetLastError is changing in HIP 7.0. Previously, the reported error was cleared on each HIP API call, so hipGetLastError reported any error that occurred during the last HIP API call. Moving forward, the reported error will only be cleared on each call to hipGetLastError, so hipGetLastError will report any error that has occurred since the last call to hipGetLastError.

Some of our tests rely on observing a return value of hipErrorOutOfMemory from hipMalloc when an allocation is too large for a given GPU architecture's memory system. This sets the internal HIP error, which is not cleared before subsequent tests call hipGetLastError, causing them to fail. This change adds extra calls to hipGetLastError to clear the error (for future tests) in cases where tests run out of memory.
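The pattern can be illustrated with a small mock. The code below is not the HIP API: `mock_runtime`, `mock_error`, and `try_allocate_or_skip` are hypothetical names that only model the "sticky until queried" semantics described above, so the sketch can run without a GPU.

```cpp
#include <cstddef>

// Hypothetical stand-in for a HIP-like runtime; NOT the real HIP API.
enum class mock_error { success, out_of_memory };

class mock_runtime
{
public:
    explicit mock_runtime(std::size_t capacity) : capacity_(capacity) {}

    // Fails and records an error when the request exceeds the capacity.
    mock_error allocate(std::size_t bytes)
    {
        if (bytes > capacity_)
        {
            last_error_ = mock_error::out_of_memory;
            return mock_error::out_of_memory;
        }
        // Modelled new behaviour: a successful call does not clear last_error_.
        return mock_error::success;
    }

    // Returns the recorded error and resets it; this is the only place it is cleared.
    mock_error get_last_error()
    {
        const mock_error e = last_error_;
        last_error_ = mock_error::success;
        return e;
    }

private:
    std::size_t capacity_;
    mock_error  last_error_ = mock_error::success;
};

// Test-helper pattern: treat out-of-memory as "skip this case" and clear the
// sticky error so that later checks start from a clean state.
inline bool try_allocate_or_skip(mock_runtime& rt, std::size_t bytes)
{
    if (rt.allocate(bytes) == mock_error::out_of_memory)
    {
        (void)rt.get_last_error(); // clear the recorded error for subsequent tests
        return false;
    }
    return true;
}
```

Without the extra query in the helper, a later check of the last error would still observe the out-of-memory condition from the skipped case.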
…135) In the batch_memcpy_impl struct, we define two types based on the value of the InputBufferItType template argument:

1. input_type, set to the InputBufferItType's underlying value_type's value_type.
2. Alias, a type that's set using a std::conditional statement that examines the IsMemCpy boolean template arg, and may be set to either unsigned char or the type from (1).

Later code creates a reference to (1). This causes problems when (1) is void. This change removes the definition of (1), since it does not appear to be used anywhere else in our repos (it's part of the detail namespace, so it's not public).

However, this is not enough to fix the problem, since the compiler evaluates both sides of the std::conditional we use to define (2). To work around this, this change also adds a helper struct type (AliasType) that uses template specialization on the IsMemCpy boolean to define the type that Alias will be assigned. This allows us to remove the std::conditional statement.
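A minimal sketch of the specialization approach follows. It is modelled on the description only and is not the actual rocPRIM code: `element_traits` is a hypothetical trait that, like the problematic case, cannot be instantiated for void, and the raw-pointer assumption for InputBufferItType is mine.

```cpp
#include <type_traits>

// Hypothetical trait that is ill-formed for T = void (it forms void&),
// standing in for the type lookup that broke in the void case.
template <class T>
struct element_traits
{
    using type      = T;
    using reference = T&;
};

// Primary template: declared only, selected through the IsMemCpy boolean.
template <bool IsMemCpy, class InputBufferItType>
struct AliasType;

// memcpy case: always copy raw bytes; the element type is never named,
// so element_traits<void> is never instantiated.
template <class InputBufferItType>
struct AliasType<true, InputBufferItType>
{
    using type = unsigned char;
};

// Non-memcpy case: look up the element type (assumes raw pointer-to-pointer input).
template <class InputBufferItType>
struct AliasType<false, InputBufferItType>
{
    using buffer_type = typename std::remove_pointer<InputBufferItType>::type;
    using type =
        typename element_traits<typename std::remove_pointer<buffer_type>::type>::type;
};

// By contrast, naming element_traits<void>::type as an argument of
// std::conditional would instantiate element_traits<void> even when the
// memcpy branch is the one selected, and fail to compile.
```

The specialization keeps the two cases in separate templates, so only the selected one is ever instantiated.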
This change does not seem to have made it into ROCm/rocPRIM#727.
`detail::raw_storage` is still used quite often because it performs better than other APIs. Let's undeprecate it until the alternatives perform equally well.
stanleytsang-amd
approved these changes
Jun 18, 2025
spolifroni-amd
approved these changes
Jun 18, 2025
assistant-librarian Bot pushed a commit to ROCm/rocPRIM that referenced this pull request Jun 18, 2025
Includes the following PRs: